Artificial Intelligence in Dental Caries Diagnosis and Detection: An Umbrella Review

ABSTRACT Background and Aim Dental caries is largely preventable, yet it remains an important global health issue. Numerous systematic reviews have summarized the efficacy of artificial intelligence (AI) models for the diagnosis and detection of dental caries. This umbrella review therefore aimed to synthesize the results of systematic reviews on the application and effectiveness of AI models in diagnosing and detecting dental caries. Methods MEDLINE/PubMed, IEEE Xplore, Embase, and the Cochrane Database of Systematic Reviews were searched to retrieve studies. Two authors independently screened the articles against the eligibility criteria and then appraised the included articles. The findings are summarized in tables and discussed narratively. Results A total of 1249 entries were identified, of which 7 were finally included. The most often employed AI algorithms were the multilayer perceptron, support vector machine (SVM), and neural networks. The algorithms were built to perform segmentation, classification, caries detection, diagnosis, and caries prediction from several sources, including periapical radiographs, panoramic radiographs, smartphone images, bitewing radiographs, and near-infrared light transillumination images. Convolutional neural networks (CNNs) demonstrated high sensitivity, specificity, and area under the curve in the caries detection, segmentation, and classification tests. Notably, AI in conjunction with periapical and panoramic radiography images yielded better accuracy in detecting and diagnosing dental caries. Conclusion AI models, especially CNN-based models, have considerable potential for accurate, objective dental caries diagnosis and detection. However, ethical considerations and cautious adoption remain critical to their successful integration into routine practice.

and US$ 1.55 billion, respectively, putting a large indirect cost burden on society. In addition, the issue is compounded by the uneven distribution of dental professionals, with only 1.4% working in low-income countries (Jain et al. 2024).
Early caries detection has the potential not only to prevent invasive treatments, such as selective or stepwise caries removal and restorative treatments, but also to reduce the time and cost of treatment and alleviate the strain on the healthcare system (Moharrami et al. 2024). The standard diagnostic strategy for caries detection is visual-tactile inspection, with which early lesions, such as proximal, occlusal, pit, and fissure caries, are difficult to detect (Gimenez et al. 2015). Dental radiographs, such as intraoral periapical radiographs (IOPAs) and radiovisiographs (RVGs), are commonly used in a typical clinical setting and provide a visual portrayal of the extent of the carious lesion (Muñoz-Sandoval et al. 2022). Although dental radiography is more sensitive in detecting early lesions, it yields a significant proportion of false-positive or false-negative detections because its interpretation is subjective and reliant on the examiner's experience (Schwendicke et al. 2021).
Digital health technologies and tools present excellent prospects for improving the diagnosis of oral problems. One such technology, artificial intelligence (AI), has recently become a research hotspot and an emerging trend in clinical care (Tripathy, Mathur, and Mehta 2023), with multiple studies indicating its potential in the field of dentistry (Ahmed et al. 2023; Mertens et al. 2021; Schwendicke et al. 2021). AI is a branch of computer science concerned with systems that display traits of human behavior (Patil et al. 2022). Numerous studies have examined the efficacy of AI models, including convolutional neural networks (CNNs) and artificial neural network algorithms, for the diagnosis, classification, segmentation, and prediction of dental caries (Mertens et al. 2021; Ramos-Gomez et al. 2021; Zhu et al. 2022). The evidence from these investigations is summarized in a number of systematic reviews (Ahmed et al. 2023; Khanagar et al. 2022; Moharrami et al. 2024; Prados-Privado et al. 2020).
Although studies have shown that AI is beneficial, its use has not yet been integrated into standard dental care, and the field is still in its infancy (Patil et al. 2022). An umbrella review is therefore needed to consolidate the findings of the existing systematic reviews on this subject. The findings of this study will inform dental practitioners about the practical application of AI, enable researchers to identify gaps in knowledge, and educate future dentists about evolving concepts and practices in the dental field. Furthermore, they will assist healthcare policymakers in developing policies for responsible AI integration, ultimately leading to enhanced patient care and more efficient dental caries diagnosis. The goal of this umbrella review is thus to summarize the findings of systematic reviews regarding the application and effectiveness of AI models in diagnosing and detecting dental caries.

| Methodology
An umbrella review was carried out to summarize the findings of systematic reviews on the application and effectiveness of AI models in diagnosing and detecting dental caries using the following PICO elements: P (Population): Patient dental image data sets. I (Intervention): AI-based models or algorithms for dental caries diagnosis and detection. C (Comparator): Conventional methods/other algorithms/no comparator. O (Outcome): Outcome metrics such as accuracy, sensitivity, specificity, and area under the curve (AUC).
We followed the Joanna Briggs Institute (JBI) umbrella review guidelines (JBI 2020) and the PRISMA guidelines for reporting our study (Page et al. 2021). The protocol of this review is registered with PROSPERO under registration number CRD42023464376 (PROSPERO International Prospective Register of Systematic Reviews 2023).

| Search Strategy
Two authors (S.N. and A.M.) searched the MEDLINE/PubMed, IEEE Xplore, Embase, and Cochrane Database of Systematic Reviews bibliographic databases on August 18, 2023. These databases were chosen for their comprehensive coverage of biomedical, technological, and evidence-based peer-reviewed literature. To find gray literature, we also searched Google Scholar. As Google Scholar retrieved many results, we examined only the first 100 results ranked by relevance. We also performed backward and forward citation screening; that is, we reviewed the reference lists of the included reviews and the articles citing them to find other studies pertinent to the review.
To build the search strategy, we consulted systematic reviews (SRs) relevant to the review. The terms were selected in accordance with the target study design (systematic review), target population (dental caries), and target intervention (AI-based techniques). The full search strategy used for each database is shown in Supporting Information S1: Table S1.

| Study Eligibility Criteria
This review covered systematic reviews with or without meta-analysis that concentrated on the application and effectiveness of AI-based methods in managing dental caries, without restrictions on data type (such as radiographic data or clinical data), time of publication, setting, or language. Reviews that did not report at least one of the accuracy, sensitivity, specificity, or area under the curve (AUC) measures of classifier performance were excluded. Scoping reviews, literature reviews, rapid reviews, criteria reviews, and other types of reviews lacking sound and consistent methodologies and critical appraisal techniques, as well as reviews whose main source of information was not original work, were also excluded. Additionally, we omitted editorials, preprints, commentaries, conference abstracts, and posters.
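
The four classifier-performance measures named in the eligibility criteria can all be derived from a confusion matrix and a set of classifier scores. The following is a minimal Python sketch, not taken from any included review: the class name `ConfusionCounts`, the function names, and the example numbers are hypothetical and serve only to illustrate the definitions. AUC is computed here as the probability that a randomly chosen carious case receives a higher score than a randomly chosen sound case, with ties counted as one half, which is one standard way to define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary caries / no-caries classifier (hypothetical container)."""
    tp: int  # carious, flagged as carious
    fp: int  # sound, flagged as carious
    tn: int  # sound, flagged as sound
    fn: int  # carious, flagged as sound


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of carious cases that were detected (true-positive rate)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of sound cases that were correctly cleared (true-negative rate)."""
    return c.tn / (c.tn + c.fp)


def auc(pos_scores: list[float], neg_scores: list[float]) -> float:
    """AUC as the fraction of (carious, sound) pairs ranked correctly; ties count 0.5."""
    pairs = len(pos_scores) * len(neg_scores)
    if pairs == 0:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / pairs


# Illustrative numbers only; they do not come from any included review.
counts = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
print(accuracy(counts), sensitivity(counts), specificity(counts))
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

Because accuracy depends on how common caries is in the data set while sensitivity and specificity do not, reviews that report only one of these measures are not directly comparable with reviews that report another.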

| Study Selection, Data Extraction, and Quality Appraisal
The selection process was divided into two stages: first, two reviewers (S.N. and A.M.) independently examined the titles and abstracts of all retrieved papers; second, the same two reviewers independently read the full texts of the papers selected in the previous step. To extract the data carefully and methodically, we created a form that was first pilot-tested on two studies. Two reviewers (S.N. and A.M.) independently extracted data from the included reviews into a spreadsheet.
The quality of the included reviews was then evaluated independently by two reviewers (S.N. and A.M.) using the JBI Critical Appraisal Checklist for Systematic Reviews and Research Syntheses (JBI n.d.). Any discrepancies between the reviewers at any phase were settled through discussion with senior authors who are subject-matter experts.

| Data Synthesis
The data from primary investigations were summarized in many SRs, which did not offer definitive results. Additionally, the included reviews were heterogeneous in data set types, AI classifiers, and metrics of classifier performance. We therefore used a stepwise strategy to address this variability. First, we tabulated the extracted data to assemble and organize the findings from the primary research as described in the SRs. Second, we used the vote-counting method to synthesize the retrieved data, counting and aggregating the reported findings from the included SRs to determine the overall direction of the evidence. Last, we used a narrative method to describe and evaluate the available evidence. Accordingly, we present the range of the reported classifier performance measures.
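
The vote-counting step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented by the authors: the direction labels (`favors_ai`, `no_difference`, `favors_comparator`), the function name `vote_count`, and the example findings are all hypothetical. Each reported finding contributes one vote, the direction with the most votes is taken as the overall direction, and a tie is reported as inconclusive.

```python
from collections import Counter

# Assumed direction labels; the source does not define a coding scheme.
DIRECTIONS = {"favors_ai", "no_difference", "favors_comparator"}


def vote_count(findings: list[tuple[str, str]]) -> dict:
    """Tally (source, direction) pairs and return the overall direction."""
    if not findings:
        raise ValueError("no findings to count")
    for _, direction in findings:
        if direction not in DIRECTIONS:
            raise ValueError(f"unknown direction: {direction!r}")

    tally = Counter(direction for _, direction in findings)
    ranked = tally.most_common()
    top_direction, top_votes = ranked[0]
    # A tie for first place means the votes do not single out one direction.
    tied = len(ranked) > 1 and ranked[1][1] == top_votes

    return {
        "tally": dict(tally),
        "overall": "inconclusive" if tied else top_direction,
        "share": top_votes / len(findings),
    }


# Hypothetical findings, for illustration only.
example = [
    ("Review A", "favors_ai"),
    ("Review B", "favors_ai"),
    ("Review C", "no_difference"),
]
result = vote_count(example)
print(result["overall"], result["share"])
```

Vote counting reflects only the direction of each finding, not its magnitude or the size of the underlying studies, which is why it is paired here with tabulation and narrative synthesis.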

| Search Results
As illustrated in Figure 1 (PRISMA diagram), our search of the literature databases retrieved a total of 1249 entries. Rayyan software was used to identify and delete 45 duplicates from the citations. After reviewing the titles and abstracts of the remaining 1204 entries, 446 papers were further eliminated. The remaining 20 studies underwent full-text analysis, but 13 of them were excluded because they either were not systematic reviews or did not present data on dental caries. Thus, only 7 studies that met the objective and eligibility criteria of this umbrella review were included for further analysis (Khanagar et al. 2022; Mohammad-Rahimi et al. 2022; Moharrami et al. 2024; Prados-Privado et al. 2020; Revilla-León et al. 2022; Reyes et al. 2022; Talpur et al. 2022).

| Characteristics of Included Reviews
The systematic reviews included in this study were published between 2020 and 2023, with the majority (n = 5) published in 2022 (Table 1). Five of these reviews were each conducted by reviewers from a single country (Canada, Pakistan, Brazil, Saudi Arabia, and Spain), whereas two reviews, by Mohammad-Rahimi et al. (2022) and Revilla-León et al. (2022), were carried out by teams of researchers from multiple nations. Only three of the studies (Mohammad-Rahimi et al. 2022; Moharrami et al. 2024; Reyes et al. 2022) reported having a registered protocol in the PROSPERO registry. Four studies explicitly stated that they followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Two studies (Khanagar et al. 2022; Mohammad-Rahimi et al. 2022) indicated compliance with the PRISMA-DTA (Diagnostic Test Accuracy) guidelines, while one study (Prados-Privado et al. 2020) did not state which reporting guidelines were followed. In terms of study design, all the reviews included original studies, while two also included conference proceedings (Moharrami et al. 2024; Prados-Privado et al. 2020). Three reviews limited their inclusion criteria to papers published in English (Khanagar et al. 2022; Moharrami et al. 2024; Talpur et al. 2022), whereas another three applied no language constraints (Mohammad-Rahimi et al. 2022; Prados-Privado et al. 2020; Reyes et al. 2022). One study did not clarify whether it applied language limits (Revilla-León et al. 2022). The time range covered also varied: three reviews considered research from inception (Moharrami et al. 2024; Prados-Privado et al. 2020; Reyes et al. 2022), three restricted their coverage to papers published within the last 10-12 years (Khanagar et al. 2022; Mohammad-Rahimi et al. 2022; Talpur et al. 2022), and one did not mention the timeline (Revilla-León et al. 2022).
The PICO format of all included studies is shown in Table 2. There was some variability across the PICO formats of the studies. In terms of population, four studies considered patient radiographic image databases (Mohammad-Rahimi et al. 2022, 2022; Reyes et al. 2022; Talpur et al. 2022). The intervention included AI applications (n = 3), neural networks (n = 2), and machine learning models (n = 2). As the comparator, most studies used expert judgment, clinical examination or reference tests, or no comparator. Only Talpur et al. (2022) used different machine learning techniques for the prediction of caries as a comparator group. The outcome variables reported varied across the studies, with accuracy being the most common.

| Study Search, Appraisal, and Synthesis Methods in Included SRs
Varied numbers of electronic databases were searched in the included reviews, with MEDLINE/PubMed (n = 7), EMBASE (n = 5), and Scopus (n = 5) being the most common. Five of the seven reviews reported reference list checking to identify further studies. All reviews critically assessed the methodological quality of their included studies, using six different tools in total, such as the Cochrane risk of bias assessment tool (n = 1), the JBI Critical Appraisal Checklist for Quasi-Experimental Studies (n = 1), and the revised tool for Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) (n = 4). Talpur et al. (2022) conducted a quality assessment but did not name the standard tool used. All reviews synthesized the data using a narrative approach owing to the wide heterogeneity across the AI models and algorithms used in the primary studies. Prados-Privado et al. (2020) also used descriptive measures, such as mean, standard deviation (SD), median, and percentage, for data synthesis (Supporting Information S1: Table S2).

| Search Results and Data Set Features in Included SRs
The number of studies retrieved in the included reviews ranged from 133 to 3410, whereas the number of included studies ranged from 12 to 42. In the included studies, the size of the data sets used for training and validating the algorithms varied from 32 to 12,600 (Supporting Information S1: Table S3). The included studies used a variety of dental images for training and validating the models, including periapical radiographs (n = 4), bitewing radiographs (n = 6), near-infrared light transillumination (NILT) images (n = 5), intraoral or oral photographs (n = 4), panoramic radiographs (n = 3), smartphone photos (n = 2), optical coherence tomography (OCT) images (n = 2), cone-beam computed tomography (CBCT) images (n = 2), and medical records/data sets (n = 2) (Figure 2).

| Summary of the Performance Metrics of the Algorithms
Six   3).

| Quality Appraisal Findings
All included reviews consistently and precisely described their review questions, inclusion criteria, and sources of search, and synthesized the data appropriately. However, only four studies showed a clear and comprehensive search strategy that included all pertinent search terms, subject headings, and restrictions. Quality assessments of the included studies were consistently completed with a standard tool by two or more reviewers independently, except in the review by Talpur et al. (2022), which did not specify the instrument used for quality appraisal. Four reviews included at least two experts who separately extracted data using a structured data extraction sheet to reduce bias and errors in data extraction. However, as no review included a meta-analysis, none of the reviews evaluated the likelihood of publication bias. Except for Revilla-León et al. (2022), all investigations offered pertinent, practical implications for their conclusions. All studies presented pertinent future research implications based on their findings, apart from Khanagar et al. (2022) and Mohammad-Rahimi et al. (2022).
Overall, there was a low risk of bias in the included reviews (Figure 3).

| Discussion
This umbrella review included seven systematic reviews to summarize the application and effectiveness of AI models in dental caries detection and diagnosis. Developing and registering a protocol is a critical component of a high-quality systematic review because the protocol provides a clear roadmap for the study's objectives, methods, and intended analysis. Unfortunately, five of the reviews considered in this study did not have their protocols registered in any public repository, which raises concerns about potential bias and selective reporting. A registered protocol not only promotes transparency but also safeguards against outcome reporting bias, unintended duplication of work, and resource waste, ultimately enhancing scientific standards (Johnston et al. 2019; Stewart, Moher, and Shekelle 2012). To further ensure the quality and thoroughness of systematic reviews, adherence to well-established reporting guidelines, such as PRISMA and PRISMA-DTA, is essential (Gagnier and Kellam 2013). Inadequate reporting can obscure how a review was conducted and hamper accurate judgment and correct interpretation of its findings and conclusions (Frank, Bossuyt, and McInnes 2018; Moher, Stewart, and Shekelle 2016). To increase the accuracy and openness of their systematic reviews and ultimately benefit the larger scientific community, researchers should register their protocols and comply with standard reporting guidelines.
The PICO model was created to help structure a well-constructed research question and thereby facilitate a literature search. It serves a significant role as a conceptualizing structure for evidence-based healthcare (Eriksen and Frandsen 2018). The differences in PICO structure identified across the included systematic reviews highlight the complexity and diversity of AI applications in dental caries. One noticeable distinction is the lack of clarity in the target population characteristics in some studies. For example, Revilla-León et al. (2022) defined the population as clinical applications in restorative dentistry for the identification of dental caries, whereas Prados-Privado et al. (2020) defined the population as neural networks and dental caries. This ambiguity might affect the precision and usefulness of the results, underscoring the significance of accurately defining the population under study. In terms of comparators, expert judgment, clinical examination, or reference tests were the most commonly used, mirroring real-world clinical circumstances. However, the potential shortcomings and subjectivity associated with human judgments must be acknowledged. Talpur et al. (2022) provided an alternate strategy in which machine learning approaches were used as comparators, offering a more objective basis for comparison that can help alleviate the bias inherent in expert judgment. Furthermore, the reported outcome indicators varied greatly. Moharrami et al. (2024), for example, used the F1 score, a performance metric that combines the precision and recall of a classifier (Alakus and Turkoglu 2020), while Khanagar et al. (2022) used various metrics. Because of the heterogeneity in performance measures, we could not perform meta-analyses or draw conclusive findings. Thus, future research should develop shared frameworks and standardized measurements to allow for clearer comparisons across studies and contribute to a more thorough knowledge of the topic.
In terms of key findings, CNNs, a class of deep neural networks used for image classification and computer vision tasks, consistently demonstrated high levels of specificity, sensitivity, and AUC values, all of which are critical for accurate detection, diagnosis, and prediction of dental caries. Notably, Talpur et al. (2022) found that the neural network backpropagation algorithm (an algorithm to train artificial neural networks) was the best choice for dental image data sets, obtaining an accuracy of up to 99%. In the research by Dayı et al. (2023), different CNN-based algorithms, such as MobileNetV2, VGG16, ResNet50, EfficientNet, and Inception network architectures, were assessed for their ability to detect occlusal, proximal, and cervical caries. The results demonstrated the effectiveness of AI systems based on deep learning, particularly the ResNet50-DCDNet network, which correctly identified occlusal and proximal caries and received the highest F1 score of 62.7%. Zhu et al. (2022) tested CariesNet, a CNN-based architecture, to interpret dental radiographs. In comparison with conventional manual inspection, it obtained an accuracy of 93.61% in detecting the presence of dental caries. These findings highlight the potential of CNNs and other deep learning-based methods to improve the precision and efficiency of caries identification and diagnosis in dental practice, opening new directions for dental diagnostics. This development has the potential to improve not only patient outcomes but also healthcare processes. Furthermore, as shown by Schwendicke et al. (2021), the use of AI for caries diagnosis is expected to be cost-effective, particularly in terms of reducing undiscovered lesions.
A key finding of this review is the greater accuracy of AI applied to periapical and panoramic radiographs in detecting dental caries. This matches the conclusions reached by Turosz et al. (2023) in their umbrella review of AI applications in panoramic radiograph analysis. Their review emphasized that AI applications can significantly assist dentists in analyzing dental panoramic radiographs, especially for dental caries, with a reported precision of 91.5%. Furthermore, Oztekin et al. (2023) used panoramic radiographs to test three pretrained deep learning models for automated dental caries detection. Among these models, ResNet-50, an image classification model consisting of 50 CNN layers, performed best, with an accuracy of 92.00%, sensitivity of 87.33%, and F1-score of 91.61%, demonstrating its efficacy in caries detection. However, ethical and mindful adoption is needed for the incorporation of AI in dentistry (Anil, Porwal, and Porwal 2023).
In addition, the widespread availability of high-quality AI solutions in dentistry could culminate in the emergence of a new professional area centered on AI-driven detection and treatments (Karobari et al. 2023). However, the emphasis should go beyond sample sizes to generalizability and reproducibility. Moreover, researchers must ensure transparency by sharing both data and algorithm code, allowing for rigorous validation and increasing trust in AI-driven dental diagnoses and therapies. Despite these prospects and strengths of AI in dental caries diagnosis and detection, considerable problems remain in data sharing and management. Personal patient data are required for the training, validation, and enhancement of AI algorithms, which necessitates data sharing between institutions or across national boundaries. Data security is therefore essential for the successful integration of AI into clinical practice. Mechanisms are required to guarantee the accuracy of AI algorithms and to resolve the question of who is responsible for errors made by the technology. The transition from human to autonomous agents presents substantial ethical and legal issues, raising doubts about our legal system's capacity to adjust to the changing role of AI. In addition, data quality and AI algorithm transparency are essential, since inadequate labeling can hinder AI's effectiveness in dentistry. For AI to support clear healthcare decisions, interpretability must be improved through research (Joda et al. 2019).

| Limitations and Future Recommendations
The fundamental drawback of this study derives from the wide variance in AI algorithms, data sources, and performance measures observed among the included studies. CNNs, ANNs, RetinaNet, VGG models, and other techniques were used to analyze data from several sources, including periapical radiographs, smartphone images, bitewing radiographs, and NILT images. Furthermore, classifier effectiveness was evaluated using various measures, such as accuracy, sensitivity, specificity, ROC analysis, PPV, and NPV. Owing to this wide heterogeneity, statistical pooling of the data was not possible in this umbrella review. The presence of primary studies that may have been duplicated across the included reviews is another key limitation. Because of this overlap in original research across the included systematic reviews, the published classifier performance ranges may be misleading. Moreover, our umbrella review relies on existing systematic reviews and meta-analyses, so there is a risk of excluding the latest evidence from studies not yet included in such analyses (Gianfredi et al. 2022).
Future umbrella reviews should include a thorough screening strategy to uncover and remove duplicate studies, ensuring that the results are based on distinct data sets and findings. Moreover, future research should focus on ensuring consistency in AI algorithms, data formats, and evaluation metrics to facilitate direct comparison. In addition, there is a great deal of potential for using smartphones to take dental images, yet this has rarely been studied so far. Future studies should test the incorporation of AI technologies in the identification of dental caries using these devices to extract crucial factors from dental photos, thereby increasing the availability of dental health evaluations and enhancing the precision and application of AI-based caries detection technologies. Furthermore, to assess the real-time clinical implications of AI models, longitudinal studies in real-world settings should be carried out. Interdisciplinary collaboration between experts from dentistry, computer science, and radiology, as well as regulatory bodies, to meet end-users' demands along with safety, accuracy, and reliability standards is crucial for advancing the ethical integration of AI in dental practice.

| Conclusion
AI models, especially CNN-based models, have considerable potential for accurate, objective dental caries diagnosis and detection. This underlines the promising role of AI in improving the precision and effectiveness of caries detection and diagnosis in dental practice, with the potential to enhance patient outcomes and streamline healthcare procedures. However, meeting end-users' demands along with safety, accuracy, and reliability standards remains critical to its successful integration into routine practice.

P (Population): Patient dental image data sets. I (Intervention): AI-based models or algorithms for dental caries diagnosis and detection. C (Comparator): Conventional methods/other algorithms/no comparator.

FIGURE 2 | Variety of dental images used across studies.

FIGURE 3 | Quality appraisal of included reviews.

TABLE 1 | Meta-data of studies.

TABLE 2 | PICO format of included studies.

TABLE 3 | AI performance indicators in included studies and significance/direction.